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WebRTC Video Processing and Codec Requirements 
Abstract 
This specification provides the requirements and considerations for 
WebRTC applications to send and receive video across a network. It 
specifies the video processing that is required as well as video 
codecs and their parameters. 
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1. Introduction 


One of the major functions of WebRTC endpoints is the ability to send 
and receive interactive video. The video might come from a camera, a 
screen recording, a stored file, or some other source. This 
specification provides the requirements and considerations for WebRTC 
applications to send and receive video across a network. It 
specifies the video processing that is required as well as video 
codecs and their parameters. 


Note that this document only discusses those issues dealing with 
video-codec handling. Issues that are related to transport of media 
streams across the network are specified in [WebRTC-RTP-USAGE]. 


2. Terminology 
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
"SHOULD", “SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
document are to be interpreted as described in [RFC2119]. 
The following definitions are used in this document: 
O A WebRTC browser (also called a WebRTC User Agent or WebRTC UA) is 


something that conforms to both the protocol specification and the 
Javascript API (see [RTCWEB-OVERVIEW]). 
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o A WebRTC non-browser is something that conforms to the protocol 
specification, but it does not claim to implement the Javascript 
API. This can also be called a "WebRTC device" or "WebRTC native 
application". 


o A WebRTC endpoint is either a WebRTC browser or a WebRTC non- 
browser. It conforms to the protocol specification. 


o A WebRTC-compatible endpoint is an endpoint that is able to 
successfully communicate with a WebRTC endpoint but may fail to 
meet some requirements of a WebRTC endpoint. This may limit where 
in the network such an endpoint can be attached, or it may limit 
the security guarantees that it offers to others. It is not 
constrained by this specification; when it is mentioned at all, it 
is to note the implications on WebRTC-compatible endpoints of the 
requirements placed on WebRTC endpoints. 


These definitions are also found in [RTCWEB-OVERVIEW] and that 
document should be consulted for additional information. 


3. Pre- and Post-Processing 


This section provides guidance on pre- and post-processing of video 
streams. 


Unless specified otherwise by the Session Description Protocol (SDP) 
or codec, the color space SHOULD be sRGB [SRGB]. For clarity, this 
is the color space indicated by codepoint 1 from "ColourPrimaries" as 
defined in [IEC23001-8]. 


Unless specified otherwise by the SDP or codec, the video scan 
pattern for video codecs is Y’CbCr 4:2:0. 


3.1. Camera-Source Video 
This document imposes no normative requirements on camera capture; 
however, implementors are encouraged to take advantage of the 
following features, if feasible for their platform: 
o Automatic focus, if applicable for the camera in use 


o Automatic white balance 


o Automatic light-level control 
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o Dynamic frame rate for video capture based on actual encoding in 
use (e.g., if encoding at 15 fps due to bandwidth constraints, low 
light conditions, or application settings, the camera will ideally 
capture at 15 fps rather than a higher rate). 


3.2. Screen-Source Video 


If the video source is some portion of a computer screen (e.g., 
desktop or application sharing), then the considerations in this 
section also apply. 


Because screen-sourced video can change resolution (due to, e.g., 
window resizing and similar operations), WebRTC-video recipients MUST 
be prepared to handle midstream resolution changes in a way that 
preserves their utility. Precise handling (e.g., resizing the 
element a video is rendered in versus scaling down the received 
stream; decisions around letter/pillarboxing) is left to the 
discretion of the application. 


Note that the default video-scan format (Y’CbCr 4:2:0) is known to be 
less than optimal for the representation of screen content produced 
by most systems in use at the time of this document’s writing, which 
generally use RGB with at least 24 bits per sample. In the future, 
it may be advisable to use video codecs optimized for screen content 
for the representation of this type of content. 


Additionally, attention is drawn to the requirements in Section 5.2 
of [WebRTC-SEC-ARCH] and the considerations in Section 4.1.1. of 
[WebRTC-SEC] . 


4. Stream Orientation 


In some circumstances -- and notably those involving mobile devices 
—- the orientation of the camera may not match the orientation used 
by the encoder. Of more importance, the orientation may change over 
the course of a call, requiring the receiver to change the 
orientation in which it renders the stream. 


While the sender may elect to simply change the pre-encoding 
orientation of frames, this may not be practical or efficient (in 
particular, in cases where the interface to the camera returns pre- 
compressed video frames). Note that the potential for this behavior 
adds another set of circumstances under which the resolution of a 
screen might change in the middle of a video stream, in addition to 
those mentioned in Section 3.2. 
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To accommodate these circumstances, WebRTC implementations that can 
generate media in orientations other than the default MUST support 
generating the RO and R1 bits of the Coordination of Video 
Orientation (CVO) mechanism described in Section 7.4.5 of [TS26.114] 
and MUST send them for all orientations when the peer indicates 
support for the mechanism. They MAY support sending the other bits 
in the CVO extension, including the higher-resolution rotation bits. 
All implementations SHOULD support interpretation of the RO and R1 
bits and MAY support the other CVO bits. 


Further, some codecs support in-band signaling of orientation (for 
example, the SEI "Display Orientation" messages in H.264 and H.265 
[H265]). If CVO has been negotiated, then the sender MUST NOT make 
use of such codec-specific mechanisms. However, when support for CVO 
is not signaled in the SDP, then such implementations MAY make use of 
the codec-specific mechanisms instead. 


5. Mandatory-to-Implement Video Codec 


For the definitions of "WebRTC browser", "“WebRTC non-browser", and 
"WebRTC-compatible endpoint" as they are used in this section, please 
refer to Section 2. 


WebRTC Browsers MUST implement the VP8 video codec as described in 
[RFC6386] and H.264 Constrained Baseline as described in [H264]. 


WebRTC Non-Browsers that support transmitting and/or receiving video 
MUST implement the VP8 video codec as described in [RFC6386] and 
H.264 Constrained Baseline as described in [H264]. 


NOTE: To promote the use of non-royalty-bearing video codecs, 
participants in the RTCWEB working group, and any successor 
working groups in the IETF, intend to monitor the evolving 
licensing landscape as it pertains to the two mandatory-to- 
implement codecs. If compelling evidence arises that one of the 
codecs is available for use on a royalty-free basis, the working 
group plans to revisit the question of which codecs are required 
for Non-Browsers, with the intention being that the royalty-free 
codec will remain mandatory to implement and the other will become 
optional. 


These provisions apply to WebRTC Non-Browsers only. There is no 
plan to revisit the codecs required for WebRTC Browsers. 
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"WebRTC-compatible endpoints" are free to implement any video codecs 
they see fit. This follows logically from the definition of "WebRTC- 
compatible endpoint". It is, of course, advisable to implement at 
least one of the video codecs that is mandated for WebRTC browsers, 
and implementors are encouraged to do so. 


6. Codec-Specific Considerations 


SDP allows for codec-independent indication of preferred video 
resolutions using the mechanism described in [RFC6236]. WebRTC 
endpoints MAY send an "a=imageattr" attribute to indicate the maximum 
resolution they wish to receive. Senders SHOULD interpret and honor 
this attribute by limiting the encoded resolution to the indicated 
maximum size, as the receiver may not be capable of handling higher 
resolutions. 


Additionally, codecs may include codec-specific means of signaling 
maximum receiver abilities with regard to resolution, frame rate, and 
bitrate. 


Unless otherwise signaled in SDP, recipients of video streams MUST be 
able to decode video at a rate of at least 20 fps at a resolution of 
at least 320 pixels by 240 pixels. These values are selected based 
on the recommendations in [HSUP1]. 


Encoders are encouraged to support encoding media with at least the 
same resolution and frame rates cited above. 


6.1. VP8 


For the VP8 codec, defined in [RFC6386], endpoints MUST support the 
payload formats defined in [RFC7741]. 


In addition to the [RFC6236] mechanism, VP8 encoders MUST limit the 
streams they send to conform to the values indicated by receivers in 
the corresponding max-fr and max-fs SDP attributes. 


Unless otherwise signaled, implementations that use VP8 MUST encode 
and decode pixels with an implied 1:1 (square) aspect ratio. 


6.2. H.264 


For the [H264] codec, endpoints MUST support the payload formats 
defined in [RFC6184]. In addition, they MUST support Constrained 
Baseline Profile Level 1.2 and SHOULD support H.264 Constrained High 
Profile Level 1.3. 
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Implementations of the H.264 codec have utilized a wide variety of 
optional parameters. To improve interoperability, the following 
parameter settings are specified: 


packetization-mode: Packetization-mode 1 MUST be supported. Other 
modes MAY be negotiated and used. 


profile-level-id: Implementations MUST include this parameter within 
SDP and MUST interpret it when receiving it. 


max-mbps, max-smbps, max-fs, max-cpb, max-dpb, and max-br: 


These parameters allow the implementation to specify that they can 
support certain features of H.264 at higher rates and values than 
those signaled by their level (set with profile-level-id). 
Implementations MAY include these parameters in their SDP, but 
they SHOULD interpret them when receiving them, allowing them to 
send the highest quality of video possible. 


sprop-parameter-sets: H.264 allows sequence and picture information 
to be sent both in-band and out-of-band. WebRTC implementations 
MUST signal this information in-band. This means that WebRTC 
implementations MUST NOT include this parameter in the SDP they 
generate. 


H.264 codecs MAY send and MUST support proper interpretation of 
Supplemental Enhancement Information (SEI) "filler payload" and "full 
frame freeze" messages. The "full frame freeze" messages are used in 
video-switching MCUs, to ensure a stable decoded displayed picture 
while switching among various input streams. 


When the use of the video orientation (CVO) RTP header extension is 
not signaled as part of the SDP, H.264 implementations MAY send and 
SHOULD support proper interpretation of Display Orientation SEI 
messages. 


Implementations MAY send and act upon "User data registered by Rec. 
ITU-T T.35" and "User data unregistered" messages. Even if they do 
not act on them, implementations MUST be prepared to receive such 
messages without any ill effects. 


Unless otherwise signaled, implementations that use H.264 MUST encode 
and decode pixels with an implied 1:1 (Square) aspect ratio. 
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7. 


8. 


8. 


Security Considerations 


This specification does not introduce any new mechanisms or security 
concerns beyond what is in the other documents it references. In 
WebRTC, video is protected using Datagram Transport Layer Security 
(DTLS) / Secure Real-time Transport Protocol (SRTP). A complete 
discussion of the security considerations can be found in 
[WebRTC-SEC] and [WebRTC-SEC-ARCH]. Implementors should consider 
whether the use of variable bitrate video codecs are appropriate for 
their application, keeping in mind that the degree of inter-frame 
change (and, by inference, the amount of motion in the frame) may be 
deduced by an eavesdropper based on the video stream’s bitrate. 


Implementors making use of H.264 are also advised to take careful 
note of the "Security Considerations" section of [RFC6184], paying 
special regard to the normative requirement pertaining to SEI 
messages. 
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